
    A Cyber Threat Intelligence Sharing Scheme based on Federated Learning for Network Intrusion Detection

    The use of Machine Learning (ML) in the detection of network attacks has been effective when a system is designed and evaluated within a single organisation. However, designing an ML-based detection system that utilises heterogeneous network data samples originating from several sources remains very challenging, mainly due to privacy concerns and the lack of a universal dataset format. In this paper, we propose a collaborative federated learning scheme to address these issues. The proposed framework allows multiple organisations to join forces in the design, training, and evaluation of a robust ML-based network intrusion detection system. The threat intelligence scheme relies on two critical aspects: first, the availability of network traffic data in a common format, which allows meaningful patterns to be extracted across data sources; and second, the adoption of a federated learning mechanism, which avoids the need to share sensitive user information between organisations. As a result, each organisation benefits from the cyber threat intelligence of the other organisations while keeping its own data private. The model is trained locally, and only the updated weights are shared with the remaining participants in the federated averaging process. The framework has been designed and evaluated using two key datasets in a NetFlow format, known as NF-UNSW-NB15-v2 and NF-BoT-IoT-v2, against two other common scenarios: a centralised training method, where local data samples are shared with other organisations, and a localised training method, where no threat intelligence is shared. The results demonstrate the efficiency and effectiveness of the proposed framework: a universal ML model effectively classifies benign and intrusive traffic originating from multiple organisations without the need for local data exchange.
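The weight-sharing step described above can be sketched as federated averaging, where each participant's weights are combined in proportion to its local dataset size. This is a minimal illustration with flat lists of floats standing in for real model tensors; the organisation names and numbers are hypothetical.

```python
# Minimal sketch of federated averaging (FedAvg): only weights are
# exchanged, never the raw flow records themselves.

def federated_average(client_weights, client_sizes):
    """Weighted average of client weight vectors by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    avg = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            avg[i] += w * (size / total)
    return avg

# Two hypothetical organisations contribute locally trained weights.
org_a = [0.2, 0.8, -0.1]   # trained on 1000 local flow records
org_b = [0.4, 0.6,  0.3]   # trained on 3000 local flow records
global_model = federated_average([org_a, org_b], [1000, 3000])
print(global_model)  # approximately [0.35, 0.65, 0.2]
```

The averaged model is then redistributed to all participants for the next local training round.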

    Towards a Standard Feature Set of NIDS Datasets

    Network Intrusion Detection System (NIDS) datasets are essential tools used by researchers for the training and evaluation of Machine Learning (ML)-based NIDS models. There are currently five datasets, known as NF-UNSW-NB15, NF-BoT-IoT, NF-ToN-IoT, NF-CSE-CIC-IDS2018 and NF-UQ-NIDS, that share a common feature set. However, their performance in classifying network traffic, mainly using the multi-class classification method, is often unreliable. Therefore, this paper proposes a standard NetFlow feature set to be used in future NIDS datasets, given the tremendous benefits of a common feature set. NetFlow has been widely utilised in the networking industry for its practical scaling properties. The evaluation is conducted by extracting and labelling the proposed features from four well-known datasets. The newly generated datasets are known as NF-UNSW-NB15-v2, NF-BoT-IoT-v2, NF-ToN-IoT-v2, NF-CSE-CIC-IDS2018-v2 and NF-UQ-NIDS-v2. Their performance has been compared to that of the respective original datasets using an Extra Trees classifier, showing a great improvement in attack detection accuracy. The datasets have been made publicly available for research purposes.
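The benefit of a common feature set is that flow records from heterogeneous sources align column-wise. A minimal sketch of that normalisation step, assuming an illustrative subset of NetFlow-style field names (not the paper's exact proposed schema):

```python
# Hypothetical common NetFlow-style schema; real datasets use a much
# larger field list, these names are illustrative only.
COMMON_FEATURES = ["IN_BYTES", "OUT_BYTES", "IN_PKTS", "OUT_PKTS",
                   "PROTOCOL", "FLOW_DURATION_MS"]

def to_common_format(raw_flow):
    """Map one tool-specific flow record onto the common feature set,
    filling absent fields with 0 so all datasets align column-wise."""
    return {f: raw_flow.get(f, 0) for f in COMMON_FEATURES}

record = {"IN_BYTES": 1200, "IN_PKTS": 10, "PROTOCOL": 6}  # partial record
row = to_common_format(record)
print(row["OUT_BYTES"], row["PROTOCOL"])  # 0 6
```

Once every dataset is expressed over the same columns, a single classifier such as Extra Trees can be trained and compared across them directly.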

    From Zero-Shot Machine Learning to Zero-Day Attack Detection

    The standard ML methodology assumes that test samples are drawn from a set of classes observed in the training phase, where the model extracts and learns useful patterns to detect new data samples belonging to those same classes. However, in applications such as Network Intrusion Detection Systems (NIDSs), it is challenging to obtain data samples for all attack classes that the model will likely observe in production. ML-based NIDSs face new attack traffic, known as zero-day attacks, which cannot be used in the training of the learning models because it does not exist at that time. In this paper, a zero-shot learning methodology is proposed to evaluate ML model performance in the detection of zero-day attack scenarios. In the attribute learning stage, the ML models map the network data features to semantic attributes that distinguish the known (seen) attack classes. In the inference stage, the models are evaluated on the detection of zero-day (unseen) attack classes by constructing relationships between known attacks and zero-day attacks. A new metric, the Zero-day Detection Rate, is defined to measure the effectiveness of the learning model in the inference stage. The results demonstrate that the majority of attack classes do not represent a significant risk to organisations adopting an ML-based NIDS in a zero-day attack scenario. However, for certain attack groups identified in this paper, such systems fail to apply the learnt attributes of attack behaviour to detect them as malicious. Further analysis was conducted using the Wasserstein distance to measure how different such attacks are from the attack types used in the training of the ML model. The results demonstrate that sophisticated attacks with a low zero-day detection rate have a significantly distinct feature distribution compared to the other attack classes.
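The two quantitative ideas in the abstract can be sketched concretely: a zero-day detection rate as the fraction of unseen-attack samples flagged malicious, and a one-dimensional empirical Wasserstein-1 distance between two feature distributions. Both are plain-vanilla definitions, not necessarily the paper's exact formulations.

```python
def zero_day_detection_rate(predictions):
    """predictions: 1 = flagged malicious, 0 = judged benign, over
    samples that all belong to an attack class unseen in training."""
    return sum(predictions) / len(predictions)

def wasserstein_1d(xs, ys):
    """Empirical 1-D Wasserstein-1 distance for equal-sized samples:
    mean absolute difference of the sorted values."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

print(zero_day_detection_rate([1, 1, 0, 1]))  # 0.75
print(wasserstein_1d([0, 1, 2], [3, 4, 5]))   # 3.0
```

A large Wasserstein distance between an unseen attack's features and the training attacks would, under the paper's reasoning, coincide with a low zero-day detection rate.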

    XG-BoT: An Explainable Deep Graph Neural Network for Botnet Detection and Forensics

    In this paper, we propose XG-BoT, an explainable deep graph neural network model for botnet node detection. The proposed model is composed of a botnet detector and an explainer for automatic forensics. The XG-BoT detector can effectively detect malicious botnet nodes in large-scale networks. Specifically, it utilises a grouped reversible residual connection with a graph isomorphism network to learn expressive node representations from botnet communication graphs. The explainer in XG-BoT can perform automatic network forensics by highlighting suspicious network flows and related botnet nodes. We evaluated XG-BoT on real-world, large-scale botnet network graphs. Overall, XG-BoT outperforms the state-of-the-art in terms of the evaluation metrics. In addition, we show that the XG-BoT explainer can generate useful explanations based on GNNExplainer for automatic network forensics.

    Exploring Edge TPU for Network Intrusion Detection in IoT

    This paper explores Google's Edge TPU for implementing a practical network intrusion detection system (NIDS) at the edge of IoT, based on a deep learning approach. While a significant number of related works explore machine learning-based NIDS for the IoT edge, they generally do not consider the required computational and energy resources. The focus of this paper is therefore the computational and energy efficiency of deep learning-based NIDS at the IoT edge. The paper studies Google's Edge TPU as a hardware platform and considers three key metrics: computation (inference) time, energy efficiency, and traffic classification performance. Various scaled model sizes of two major deep neural network architectures are used to investigate these metrics. The performance of the Edge TPU-based implementation is compared with that of an energy-efficient embedded CPU (ARM Cortex-A53). Our experimental evaluation shows some unexpected results, such as the fact that the CPU significantly outperforms the Edge TPU for small model sizes.
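The inference-time metric studied above amounts to averaging per-sample latency over many repeated runs. A minimal timing harness sketch, with a dummy stand-in model; on real hardware the Edge TPU or CPU inference call would replace it.

```python
import time

def dummy_model(sample):
    # Placeholder for an NIDS classifier invocation (hypothetical).
    return sum(sample) > 0

def mean_inference_time(model, samples, repeats=100):
    """Average wall-clock seconds per single-sample inference."""
    start = time.perf_counter()
    for _ in range(repeats):
        for s in samples:
            model(s)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(samples))

samples = [[0.1, -0.2, 0.3]] * 10
t = mean_inference_time(dummy_model, samples)
print(f"{t * 1e6:.2f} microseconds/sample")
```

Repeating many runs and averaging smooths out scheduler noise, which matters when comparing platforms whose per-sample latencies differ by small margins.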

    E-GraphSAGE: A Graph Neural Network based Intrusion Detection System for IoT

    This paper presents a new Network Intrusion Detection System (NIDS) based on Graph Neural Networks (GNNs). GNNs are a relatively new sub-field of deep neural networks that can leverage the inherent structure of graph-based data. Training and evaluation data for NIDSs are typically represented as flow records, which can naturally be represented in a graph format. This establishes the potential and motivation for exploring GNNs for network intrusion detection, which is the focus of this paper. Current studies on machine learning-based NIDSs consider network flows independently rather than taking their interconnected patterns into consideration. This is a key limitation in the detection of sophisticated IoT network attacks, such as DDoS and distributed port scan attacks launched by IoT devices. In this paper, we propose E-GraphSAGE, a GNN approach that overcomes this limitation and captures both the edge features of a graph and the topological information for network anomaly detection in IoT networks. To the best of our knowledge, our approach is the first successful, practical, and extensively evaluated application of Graph Neural Networks to the problem of network intrusion detection for IoT using flow-based data. Our extensive experimental evaluation on four recent NIDS benchmark datasets shows that our approach outperforms the state-of-the-art in terms of key classification metrics, which demonstrates the potential of GNNs in network intrusion detection and provides motivation for further research.
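The core idea of aggregating edge (flow) features, rather than node features alone, can be sketched as a single mean-aggregation step over a node's incident flows. This is an illustrative, untrained simplification of the approach, not the full E-GraphSAGE layer.

```python
def aggregate_edge_features(edges, node):
    """edges: list of (src, dst, feature_vector); returns the mean of
    the feature vectors over all flows incident to `node`."""
    incident = [f for s, d, f in edges if node in (s, d)]
    if not incident:
        return None
    dim = len(incident[0])
    return [sum(f[i] for f in incident) / len(incident) for i in range(dim)]

# Hypothetical flow graph: each edge carries [bytes, packets] features.
flows = [("A", "B", [100.0, 2.0]),
         ("C", "A", [300.0, 4.0]),
         ("B", "C", [500.0, 6.0])]
print(aggregate_edge_features(flows, "A"))  # [200.0, 3.0]
```

In the trainable version, this aggregated vector would be concatenated with the node's own representation and passed through a learned transformation at each layer, so topology and flow features are captured jointly.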

    Mortality from gastrointestinal congenital anomalies at 264 hospitals in 74 low-income, middle-income, and high-income countries: a multicentre, international, prospective cohort study

    Summary

    Background: Congenital anomalies are the fifth leading cause of mortality in children younger than 5 years globally. Many gastrointestinal congenital anomalies are fatal without timely access to neonatal surgical care, but few studies have been done on these conditions in low-income and middle-income countries (LMICs). We compared outcomes of the seven most common gastrointestinal congenital anomalies in low-income, middle-income, and high-income countries globally, and identified factors associated with mortality.

    Methods: We did a multicentre, international prospective cohort study of patients younger than 16 years, presenting to hospital for the first time with oesophageal atresia, congenital diaphragmatic hernia, intestinal atresia, gastroschisis, exomphalos, anorectal malformation, and Hirschsprung's disease. Recruitment was of consecutive patients for a minimum of 1 month between October, 2018, and April, 2019. We collected data on patient demographics, clinical status, interventions, and outcomes using the REDCap platform. Patients were followed up for 30 days after primary intervention, or 30 days after admission if they did not receive an intervention. The primary outcome was all-cause, in-hospital mortality for all conditions combined and each condition individually, stratified by country income status. We did a complete case analysis.

    Findings: We included 3849 patients with 3975 study conditions (560 with oesophageal atresia, 448 with congenital diaphragmatic hernia, 681 with intestinal atresia, 453 with gastroschisis, 325 with exomphalos, 991 with anorectal malformation, and 517 with Hirschsprung's disease) from 264 hospitals (89 in high-income countries, 166 in middle-income countries, and nine in low-income countries) in 74 countries. Of the 3849 patients, 2231 (58·0%) were male. Median gestational age at birth was 38 weeks (IQR 36–39) and median bodyweight at presentation was 2·8 kg (2·3–3·3). Mortality among all patients was 37 (39·8%) of 93 in low-income countries, 583 (20·4%) of 2860 in middle-income countries, and 50 (5·6%) of 896 in high-income countries (p<0·0001 between all country income groups). Gastroschisis had the greatest difference in mortality between country income strata (nine [90·0%] of ten in low-income countries, 97 [31·9%] of 304 in middle-income countries, and two [1·4%] of 139 in high-income countries; p≤0·0001 between all country income groups). Factors significantly associated with higher mortality for all patients combined included country income status (low-income vs high-income countries, risk ratio 2·78 [95% CI 1·88–4·11], p<0·0001; middle-income vs high-income countries, 2·11 [1·59–2·79], p<0·0001), sepsis at presentation (1·20 [1·04–1·40], p=0·016), higher American Society of Anesthesiologists (ASA) score at primary intervention (ASA 4–5 vs ASA 1–2, 1·82 [1·40–2·35], p<0·0001; ASA 3 vs ASA 1–2, 1·58 [1·30–1·92], p<0·0001), surgical safety checklist not used (1·39 [1·02–1·90], p=0·035), and ventilation or parenteral nutrition unavailable when needed (ventilation 1·96 [1·41–2·71], p=0·0001; parenteral nutrition 1·35 [1·05–1·74], p=0·018). Administration of parenteral nutrition (0·61 [0·47–0·79], p=0·0002) and use of a peripherally inserted central catheter (0·65 [0·50–0·86], p=0·0024) or percutaneous central line (0·69 [0·48–1·00], p=0·049) were associated with lower mortality.

    Interpretation: Unacceptable differences in mortality exist for gastrointestinal congenital anomalies between low-income, middle-income, and high-income countries. Improving access to quality neonatal surgical care in LMICs will be vital to achieve Sustainable Development Goal 3.2 of ending preventable deaths in neonates and children younger than 5 years by 2030.
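The crude mortality percentages quoted in the Findings can be checked directly from the reported counts (the risk ratios in the text come from a multivariable model and are not recomputable from these counts alone):

```python
# Deaths and patient totals per income stratum, as reported in the abstract.
strata = {"low-income":    (37, 93),
          "middle-income": (583, 2860),
          "high-income":   (50, 896)}

for name, (deaths, patients) in strata.items():
    rate = 100 * deaths / patients
    print(f"{name}: {rate:.1f}%")
# low-income: 39.8%, middle-income: 20.4%, high-income: 5.6%
```

The recomputed rates match the reported 39·8%, 20·4%, and 5·6% exactly.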

    An Explainable Machine Learning-based Network Intrusion Detection System for Enabling Generalisability in Securing IoT Networks

    Machine Learning (ML)-based network intrusion detection systems bring many benefits for enhancing the security posture of an organisation. Many systems have been designed and developed in the research community, often achieving a perfect detection rate when evaluated on certain datasets. However, this large volume of academic research has not translated into practical deployments, for a number of reasons. This paper narrows the gap by evaluating the generalisability of a common feature set across different network environments and attack types. Two feature sets (NetFlow and CICFlowMeter) have been evaluated across three datasets: CSE-CIC-IDS2018, BoT-IoT, and ToN-IoT. The results show that the NetFlow feature set improves the detection accuracy of the two ML models across the different datasets. In addition, given the complexity of the learning models, SHAP, an explainable AI methodology, has been adopted to explain and interpret their classification decisions. The Shapley values of the features have been analysed across multiple datasets to determine the influence each feature contributes towards the final ML prediction.
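The Shapley values that SHAP estimates can be computed exactly for a tiny model by enumerating feature coalitions. A sketch under stated assumptions: `value(S)` is a hypothetical toy coalition-value function (the feature names and payoffs are invented for illustration), while real SHAP approximates this sum efficiently for large models.

```python
from itertools import combinations
from math import factorial

def shapley(features, value):
    """Exact Shapley values by enumerating all coalitions of features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                # Marginal contribution of f to this coalition.
                total += weight * (value(set(subset) | {f}) - value(set(subset)))
        phi[f] = total
    return phi

def value(S):
    # Toy model output when only the features in S are "known".
    v = 0.0
    if "IN_BYTES" in S:  v += 4.0
    if "PROTOCOL" in S:  v += 1.0
    if {"IN_BYTES", "PROTOCOL"} <= S: v += 2.0  # interaction term
    return v

print(shapley(["IN_BYTES", "PROTOCOL"], value))
# {'IN_BYTES': 5.0, 'PROTOCOL': 2.0}
```

Note that the attributions sum to the full-coalition output (7.0), the efficiency property that makes Shapley values a principled way to apportion a prediction across features.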

    Inspection-L: A Self-Supervised GNN-Based Money Laundering Detection System for Bitcoin

    Criminals have become increasingly experienced in using cryptocurrencies, such as Bitcoin, for money laundering. Cryptocurrencies can hide criminal identities while transferring hundreds of millions of dollars of dirty funds through criminal digital wallets. However, this is something of a paradox, because cryptocurrencies are gold mines for open-source intelligence, giving law enforcement agencies greater power to conduct forensic analyses. This paper proposes Inspection-L, a graph neural network (GNN) framework based on self-supervised Deep Graph Infomax (DGI) combined with a supervised learning algorithm, namely Random Forest (RF), to detect illicit transactions for anti-money laundering (AML). To the best of our knowledge, this is the first application of self-supervised GNNs to the problem of AML in Bitcoin. The proposed method has been evaluated on the Elliptic dataset and outperforms the baseline in terms of key classification metrics, demonstrating the potential of self-supervised GNNs in the detection of illicit cryptocurrency transactions.